Simon Frost (@sdwfrost), 2023-02-01
Pathogen.jl, described in this paper, is a package to simulate transmission network individual-level models, in which individuals can have their own risk factors that affect their contact with others, their susceptibility, their infectivity on infection, their recovery rates, etc.. It includes methods for stochastic simulation and Bayesian inference of SEIR, SEI, SIR, and SI individual level models. This tutorial is based on that included in the Pathogen.jl package, available here, but simplified to match the parameter values used in this repository as closely as possible, in order to generate comparable results. This tutorial focuses on inference; for computational purposes, we consider a smaller population (N=100) than in most of the other examples; this also makes visualization of the transmission network clearer. The random number seed was chosen so as to ensure than an epidemic did not fade out immediately due to the small number (1) of initial infected individuals.
using Random using Distributions using Pathogen using Plots using Plots.PlotMeasures using StatsPlots;
We set the random number seed for reproducibility.
Random.seed!(1);
The setting up and running of the model is described in more detail in the simulation example.
N = 100 I₀ = 1 locations = DataFrame(x = rand(Uniform(0, 10), N), y = rand(Uniform(0, 10), N)) # Precalculate distances dists = [1.0 for i = 1:N, j = 1:N] # Set diagonal to zero [dists[i,i] = 0.0 for i in 1:N] pop = Population(locations, dists) function _constant(params::Vector{Float64}, pop::Population, i::Int64) return params[1] end function _one(params::Vector{Float64}, pop::Population, i::Int64) return 1.0 end function _zero(params::Vector{Float64}, pop::Population, i::Int64) return 0.0 end function _one(params::Vector{Float64}, pop::Population, i::Int64, k:: Int64) return 1.0 end rf = RiskFunctions{SIR}(_zero, # sparks function _one, # susceptibility function _one, # infectivity function: defines a distance _constant, # transmissability function _constant) # removal function # Parametrize risk functions for simulation rparams = RiskParameters{SIR}(Float64[], # sparks function parameter(s) Float64[], # susceptibility function parameter(s) Float64[], # infectivity function parameter(s) [0.5/N], # transmissibility function parameter(s) βc/N [0.25]) # removal function parameter(s) γ # Set starting states in population # Set first I₀ individuals as infectious, others as susceptible to start starting_states = [fill(State_I, I₀); fill(State_S, N-I₀)] # Initialize simulation sim = Simulation(pop, starting_states, rf, rparams) # Simulate simulate!(sim, tmax=40.0);
p1 = plot(sim.events, 0.0, 40.0, legendfont=font(6), xaxis=font(10), bottom_margin=30px) # Population/TransmissionNetwork plots p2=plot(sim.transmission_network, sim.population, sim.events, 0.0, title="Time = 0", titlefontsize = 8) p3=plot(sim.transmission_network, sim.population, sim.events, 10.0, title="Time = 10", titlefontsize = 8) p4=plot(sim.transmission_network, sim.population, sim.events, 20.0, title="Time = 20", titlefontsize = 8) p5=plot(sim.transmission_network, sim.population, sim.events, 30.0, title="Time = 30", titlefontsize = 8) p6=plot(sim.transmission_network, sim.population, sim.events, 40.0, title="Time = 40", titlefontsize = 8) # Combine l = @layout [a grid(1,5)] combinedplots1 = plot(p1, p2, p3, p4, p5, p6, layout=l)
We first generate exact observations for infection and removal; the observe function from Pathogen.jl expects a UnivariateDistribution from Distributions.jl.
obs = observe(sim, Dirac(0.0), Dirac(0.0), force=true);
Next, we specify some priors for the risk parameters of our various risk functions, using UnivariateDistribution[] for fixed parameters.
rpriors = RiskPriors{SIR}(UnivariateDistribution[], UnivariateDistribution[], UnivariateDistribution[], [Uniform(0.001, 0.01)], [Uniform(0.1, 0.5)]);
We then set some extents for event data augmentation
ee = EventExtents{SIR}(0.001, 0.001);
The Markov Chain Monte Carlo (MCMC) is initialized as follows. Multiple attempts are needed to get initial parameter values; larger numbers of attempts are needed for larger populations.
mcmc = MCMC(obs, ee, pop, starting_states, rf, rpriors) start!(mcmc, attempts=10000); # 1 chain, with 10k initialization attempts
Assuming that the above has successfully initialized the Markov chain, the MCMC can be run as follows. Here, we use 11000 iterations, including a 1000 sample burn-in period, which should take a few minutes to run.
niter = 11000 burnin = 1000 thin = 20 iterate!(mcmc, niter, 1.0, condition_on_network=true, event_batches=5);
The summary function provides a table of the estimated parameters (the first is transmissibility, the second is recovery/removal).
summary(mcmc, burnin=burnin, thin=thin)
| Row | parameter | mean | var | CI |
|---|---|---|---|---|
| String | Float64 | Float64 | Tuple… | |
| 1 | Ωt₁ | 0.0048973 | 3.10216e-7 | (0.00392078, 0.00612848) |
| 2 | Ωr₁ | 0.229968 | 0.000595299 | (0.18608, 0.279817) |
These parameter values can be extracted and visualized as follows.
β = [mcmc.markov_chains[1].risk_parameters[i].transmissibility[1] for i in (burnin+1):niter] γ = [mcmc.markov_chains[1].risk_parameters[i].removal[1] for i in (burnin+1):niter];
marginalkde(β, γ)
Note that the removal rate estimates are centered close to the true value, while the inferred transmissibility parameter is an underestimate of the true value (although this is only for a single realization).
Here are the MCMC posterior plots, which show good mixing of the Markov chains.
# MCMC trace plot p1 = plot(1:thin:niter, mcmc.markov_chains[1].risk_parameters, yscale=:log10, title="TN-ILM parameters", xguidefontsize=8, yguidefontsize=8, xtickfontsize=7, ytickfontsize=7, titlefontsize=11, bottom_margin=30px) # State plots p2 = plot(mcmc.markov_chains[1].events[niter], State_S, linealpha=0.01, title="S", xguidefontsize=8, yguidefontsize=8, xtickfontsize=7, ytickfontsize=7, titlefontsize=11) for i=(burnin+1):thin:niter plot!(p2, mcmc.markov_chains[1].events[i], State_S, linealpha=0.01) end plot!(p2, sim.events, State_S, linecolor=:black, linewidth=1.5) p3 = plot(mcmc.markov_chains[1].events[niter], State_I, linealpha=0.01, title="I", xguidefontsize=8, yguidefontsize=8, xtickfontsize=7, ytickfontsize=7, titlefontsize=11) for i=(burnin+1):thin:niter plot!(p3, mcmc.markov_chains[1].events[i], State_I, linealpha=0.01) end plot!(p3, sim.events, State_I, linecolor=:black, linewidth=1.5) p4 = plot(mcmc.markov_chains[1].events[niter], State_R, linealpha=0.01, title="R", xguidefontsize=8, yguidefontsize=8, xtickfontsize=7, ytickfontsize=7, titlefontsize=11) for i=(burnin+1):thin:niter plot!(p4, mcmc.markov_chains[1].events[i], State_R, linealpha=0.01) end plot!(p4, sim.events, State_R, linecolor=:black, linewidth=1.5) # Combine l = @layout [a; [b c d]] combinedplots2 = plot(p1, p2, p3, p4, layout=l)
Finally, we can plot out the inferred transmission network compared to the true network.
p1 = plot(sim.transmission_network, sim.population, title="True Transmission\nNetwork", titlefontsize=11, framestyle=:box) tnp = TransmissionNetworkPosterior(mcmc, burnin=burnin, thin=thin) p2 = plot(tnp, sim.population, title="Transmission Network\nPosterior Distribution", titlefontsize=11, framestyle=:box) combinedplots3 = plot(p1, p2, layout=(1, 2))